Given a string s and an integer k, return the length of the longest substring of s that contains at most k distinct characters.
Example 1:
Input: s = "eceba", k = 2 Output: 3 Explanation: The substring is "ece" with length 3.
Example 2:
Input: s = "aa", k = 1 Output: 2 Explanation: The substring is "aa" with length 2.
Constraints:
1 <= s.length <= 5 * 1040 <= k <= 50
Average Rating: 3.92 (73 votes)
Video Solution
Solution Article
Approach 1: Sliding Window + Hashmap.
Intuition
We could take some inspiration from a simpler problem called longest substring with at most two distinct characters.
To solve the problem in one pass
let's use here sliding window approach with two set pointers
left and right serving as the window boundaries.
The idea is to set both pointers in the position 0 and
then move right pointer to the right while the
window contains not more than k distinct characters.
If at some point we've got k + 1 distinct characters,
let's move left pointer to keep not more than k + 1
distinct characters in the window.
Basically that's the algorithm : to move sliding window along the string,
to keep not more than k distinct characters in the window, and
to update max substring length at each step.
There is just one more question to reply - how to move the left pointer to keep only
kdistinct characters in the string?
Let's use for this purpose hashmap containing all characters
in the sliding window as keys and their rightmost positions
as values. At each moment, this hashmap could contain
not more than k + 1 elements.
For example, using this hashmap one knows that the rightmost position
of character O in "LOVELEE" window is 1 and so one has
to move left pointer in the position 1 + 1 = 2 to
exclude the character O from the sliding window.
Algorithm
Now one could write down the algortihm.
- Return
0if the string is empty orkis equal to zero. - Set both set pointers in the beginning
of the string
left = 0andright = 0and init max substring lengthmax_len = 1. - While
rightpointer is less thanN:- Add the current character
s[right]in the hashmap and moverightpointer to the right. - If hashmap contains
k + 1distinct characters, remove the leftmost character from the hashmap and move theleftpointer so that sliding window contains againkdistinct characters only. - Update
max_len.
- Add the current character
Implementation
Complexity Analysis
Do we have here the best possible time complexity O(N) as it was for more simple problem with at most two distinct characters?
For the best case when input string contains not more than
k distinct characters the answer is yes.
It's the only one pass along the string with
N characters and the time complexity is O(N).
For the worst case when the input string contains
n distinct characters, the answer is no. In that case at each
step one uses O(k) time to find a minimum value
in the hashmap with k elements and so the overall time
complexity is O(Nk).
-
Time complexity : O(N) in the best case of
kdistinct characters in the string and O(Nk) in the worst case ofNdistinct characters in the string. -
Space complexity : O(k) since additional space is used only for a hashmap with at most
k + 1elements.
Approach 2: Sliding Window + Ordered Dictionary.
How to achieve O(N) time complexity
Approach 1 with a standard hashmap couldn't ensure O(N) time complexity.
To have O(N) algorithm performance, one would need a structure, which provides four operations in O(1) time :
-
Insert the key
-
Get the key and check if the key exists
-
Delete the key
-
Return the first or last added key/ value
The first three operations in O(1) time are provided by the standard hashmap, and the forth one - by linked list.
There is a structure called ordered dictionary, it combines behind both hashmap and linked list. In Python this structure is called OrderedDict and in Java LinkedHashMap.
Ordered dictionary is quite popular for interviews. for example, check out the Implementing a LRU Cache question by Google.
Algorithm
Let's use ordered dictionary instead of standard hashmap to trim the algorithm from approach 1 :
- Return
0if the string is empty orkis equal to zero. - Set both pointers to the beginning
of the string
left = 0andright = 0and initialize max substring lengthmax_len = 1. - While
rightpointer is less thanN:- If the current character
s[right]is already in the ordered dictionaryhashmap-- delete it, to ensure that the first key inhashmapis the leftmost character. - Add the current character
s[right]in the ordered dictionary and moverightpointer to the right. - If ordered dictionary
hashmapcontainsk + 1distinct characters, remove the leftmost one and move theleftpointer so that sliding window contains againkdistinct characters only. - Update
max_len.
- If the current character
Implementation
Complexity Analysis
-
Time complexity : O(N) since all operations with ordered dictionary :
insert/get/delete/popitem(put/containsKey/remove) are done in a constant time. -
Space complexity : O(k) since additional space is used only for an ordered dictionary with at most
k + 1elements.
October 29, 2019 7:46 PM
don't store the index, instead, store the count of each character, once right point sees a new char (count==0), we increase the number of distinct character by 1, left pointer keeps decreasing the count, once count reaches 0, we decrease it. ordered dict is totally unnecessary
from collections import Counter
class Solution(object):
def lengthOfLongestSubstringKDistinct(self, s, k):
counter = Counter()
res, i, n = 0, 0, 0
for j, c in enumerate(s):
if counter[c] == 0: # never seen before
n += 1
counter[c] += 1
while n > k:
counter[s[i]] -= 1
if counter[s[i]] == 0:
n -= 1
i += 1
res = max(res, j - i + 1)
return res
February 25, 2019 8:10 AM
I don't get why solution 1's time complexity is not just O(N). You have 2 pointers. Right pointer takes n steps. Left pointer can take at most n steps because it's always trailing right. You can argue the time complexity is O(2N), which is O(N), but you cannot say it's O(Nk).
June 10, 2020 8:03 AM
Couldn't you store the frequency of each character in the map instead of its rightMost position? That way, when the size of the map exceeds k, you can find the character at the left pointer in the map, and decrement its frequency by 1. Then increment the left pointer (contracting the window). Keep doing this until a character's frequency reaches 0, and remove that character from the map. Wouldn't this achieve O(n) complexity?
I think my solution is O(n) in any worst case. Let me know if you think it is not. Will appreciate constructive criticism.
class Solution(object):
def lengthOfLongestSubstringKDistinct(self, s, k):
d = {}
i = j = max_len = 0
while j < len(s):
d[s[j]] = d.get(s[j], 0) + 1
if len(d) > k:
d[s[i]] -= 1
if d[s[i]] == 0:
del d[s[i]]
i += 1
max_len = max(max_len, j - i + 1)
j += 1
return max_len
February 22, 2019 4:18 PM
Great explanation, Thanks.
Last Edit: May 2, 2019 6:08 PM
For solution 2, if you used access-ordered LInkedHashMap, it could save you some time from reinserting entries every time (line 17 -21). The constructor looks like this:
Map<Character, Integer> map = new LinkedHashMap<>(k + 1, 1, true);
July 15, 2020 11:05 AM
Both solutions are overcomplicated. Simple hashtable with counts does the job
class Solution {
public:
int lengthOfLongestSubstringKDistinct(const string& s, int k) {
if(k < 1)
return 0;
unordered_map<char, int> stats;
int start = 0, end = 0;
int maxLen = 0;
for(; end < s.size(); ++end) {
++stats[s[end]];
while(stats.size() > k) {
if(--stats[s[start]] == 0) {
stats.erase(s[start]);
}
++start;
}
maxLen = max(maxLen, end-start+1);
}
return maxLen;
}
};
March 16, 2019 8:50 AM
You can definitely reach O(N) complexity with sliding window with two pointers, please refer to the following code.
public int lengthOfLongestSubstringKDistinct(String s, int k) {
Integer r=0,l=0,res=0,curDistinct=0;
int[] freq = new int[256];
while(r<s.length()){
if(++freq[s.charAt(r++)]==1) curDistinct++;
while(curDistinct>k) if(--freq[s.charAt(l++)]==0) curDistinct--;
res = Math.max(r-l, res);
}
return res;
}
You don't have any submissions yet.
xxxxxxxxxxclass Solution {public: int lengthOfLongestSubstringKDistinct(string s, int k) { }};

